16 research outputs found

    Integrating Threat Modeling and Automated Test Case Generation into Industrialized Software Security Testing

    Industrial Internet of Things (IIoT) applications provide a whole new set of possibilities to drive the efficiency of industrial production forward. However, with the higher degree of integration among systems comes a plethora of new threats to those systems, as they were not designed to be broadly reachable and interoperable. To mitigate this vast amount of new threats, systematic and automated test methods are necessary. Such comprehensiveness can be achieved through thorough threat modeling. In order to automate security testing, we present an approach that automates the testing process from threat modeling onward, closing the gap between threat modeling and automated test case generation. Comment: 3 pages, 1 figure, Central European Cybersecurity Conference 2019 (CECC2019), Munich
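    A minimal sketch of the general idea of deriving security tests from a threat model, under assumptions of our own: the Threat/TestCase structures and the category-to-template mapping below are illustrative, not the authors' implementation.

```python
# Illustrative sketch: deriving security test stubs from threat-model entries.
# The data structures and the STRIDE-category-to-template mapping are assumptions.
from dataclasses import dataclass

@dataclass
class Threat:
    component: str      # e.g. an IIoT gateway or MQTT broker
    category: str       # STRIDE category, e.g. "Tampering"
    description: str

@dataclass
class TestCase:
    name: str
    target: str
    steps: list

# Hypothetical mapping from threat categories to generic test templates.
TEMPLATES = {
    "Spoofing": ["attempt connection with a forged client identity",
                 "verify that mutual authentication is enforced"],
    "Tampering": ["replay a modified message to the target",
                  "verify that integrity checks reject the message"],
}

def generate_tests(threats):
    """Turn each modeled threat into a concrete test stub."""
    tests = []
    for t in threats:
        steps = TEMPLATES.get(t.category)
        if steps is None:
            continue  # no template for this threat category yet
        tests.append(TestCase(
            name=f"test_{t.category.lower()}_{t.component}",
            target=t.component,
            steps=steps,
        ))
    return tests

if __name__ == "__main__":
    model = [Threat("mqtt_broker", "Tampering", "Message modification in transit")]
    for tc in generate_tests(model):
        print(tc.name, tc.steps)
```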

    The Pipeline for the Continuous Development of Artificial Intelligence Models -- Current State of Research and Practice

    Companies struggle to continuously develop and deploy AI models to complex production systems while assuring quality, due to the characteristics of AI. To ease the development process, continuous pipelines for AI have become an active research area, where a consolidated and in-depth analysis of the terminology, triggers, tasks, and challenges is required. This paper includes a Multivocal Literature Review in which we consolidated 151 relevant formal and informal sources. In addition, nine semi-structured interviews with participants from academia and industry verified and extended the obtained information. Based on these sources, this paper provides and compares terminologies for DevOps and CI/CD for AI, MLOps, (end-to-end) lifecycle management, and CD4ML. Furthermore, the paper provides an aggregated list of potential triggers for reiterating the pipeline, such as alert systems or schedules. In addition, this work uses a taxonomy creation strategy to present a consolidated pipeline comprising tasks for the continuous development of AI. This pipeline consists of four stages: Data Handling, Model Learning, Software Development, and System Operations. Moreover, we map challenges regarding pipeline implementation, adaptation, and usage for the continuous development of AI to these four stages. Comment: accepted in the Journal of Systems and Software
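    A hedged sketch of such a pipeline, assuming only the four stage names and the trigger types mentioned above; the stage bodies, function names, and scheduling logic are placeholders, not a reference implementation from the review.

```python
# Illustrative sketch of a four-stage continuous-AI pipeline with re-iteration
# triggers (a schedule or a monitoring alert). Stage contents are assumptions.
from datetime import datetime, timedelta

def data_handling():         # ingest, validate, and version training data
    print("data handling")

def model_learning():        # train and evaluate candidate models
    print("model learning")

def software_development():  # package the model into the application
    print("software development")

def system_operations():     # deploy, serve, and monitor the model
    print("system operations")

STAGES = [data_handling, model_learning, software_development, system_operations]

def should_rerun(last_run, now, drift_alert, schedule=timedelta(days=1)):
    """Trigger a new pipeline iteration on a schedule or on a monitoring alert."""
    return drift_alert or (now - last_run) >= schedule

def run_pipeline():
    for stage in STAGES:
        stage()

if __name__ == "__main__":
    last_run = datetime(2024, 1, 1)
    if should_rerun(last_run, datetime(2024, 1, 2), drift_alert=False):
        run_pipeline()
```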

    Data Pipeline Quality: Influencing Factors, Root Causes of Data-related Issues, and Processing Problem Areas for Developers

    Data pipelines are an integral part of various modern data-driven systems. However, despite their importance, they are often unreliable and deliver poor-quality data. A critical step toward improving this situation is a solid understanding of the aspects contributing to the quality of data pipelines. Therefore, this article first introduces a taxonomy of 41 factors that influence the ability of data pipelines to provide quality data. The taxonomy is based on a multivocal literature review and validated by eight interviews with experts from the data engineering domain. Data, infrastructure, life cycle management, development & deployment, and processing were found to be the main influencing themes. Second, we investigate the root causes of data-related issues, their location in data pipelines, and the main topics of data pipeline processing issues for developers by mining GitHub projects and Stack Overflow posts. We found data-related issues to be primarily caused by incorrect data types (33%), mainly occurring in the data cleaning stage of pipelines (35%). Data integration and ingestion tasks were the topics developers asked about most, accounting for nearly half (47%) of all questions. Compatibility issues were found to be a separate problem area in addition to issues corresponding to the usual data pipeline processing areas (i.e., data loading, ingestion, integration, cleaning, and transformation). These findings suggest that future research efforts should focus on analyzing compatibility and data type issues in more depth and assisting developers in data integration and ingestion tasks. The proposed taxonomy is valuable to practitioners in the context of quality assurance activities and fosters future research into data pipeline quality. Comment: To be published by The Journal of Systems & Software
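    As one illustration of the kind of safeguard these findings motivate, the following hedged sketch adds an explicit data-type check at the cleaning stage of a tiny pipeline; the stage names follow the processing areas listed above, while the schema, column names, and helpers are assumptions.

```python
# Illustrative sketch: an explicit type check at the cleaning stage of a data
# pipeline, since incorrect data types were the most common root cause found.
# The schema and column names are assumptions for demonstration only.
import pandas as pd

EXPECTED_TYPES = {"sensor_id": "int64", "reading": "float64"}

def ingest():
    # Strings arriving where numbers are expected is a typical type issue.
    return pd.DataFrame({"sensor_id": [1, 2], "reading": ["3.4", "5.1"]})

def clean(df):
    for column, expected in EXPECTED_TYPES.items():
        if str(df[column].dtype) != expected:
            # Coerce instead of failing silently; invalid values become NaN.
            df[column] = pd.to_numeric(df[column], errors="coerce")
    return df.dropna()

def transform(df):
    return df.assign(reading_scaled=df["reading"] / df["reading"].max())

if __name__ == "__main__":
    print(transform(clean(ingest())))
```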

    Towards Automatic Generation of Amplified Regression Test Oracles

    Regression testing is crucial for ensuring that pure code refactoring does not adversely affect existing software functionality, but it can be expensive, accounting for half the cost of software maintenance. Automated test case generation reduces effort but may generate weak test suites. Test amplification is a promising solution that enhances test suites by generating additional tests or improving existing ones, increasing test coverage, but it faces the test oracle problem. To address this, we propose a test oracle derivation approach that uses object state data produced during System Under Test (SUT) test execution to amplify regression test oracles. The approach monitors the object state during test execution and compares it to the previous version to detect any changes in relation to the SUT's intended behaviour. Our preliminary evaluation shows that the proposed approach can substantially enhance the detection of behaviour changes, providing initial evidence of its effectiveness. Comment: 8 pages, 1 figure
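    A minimal sketch of the underlying idea, under assumptions about how object state might be captured: a test records the observable attributes of an object after execution and diffs the snapshot against one stored for the previous SUT version. Names and the snapshot format are illustrative, not the paper's implementation.

```python
# Illustrative sketch of state-based oracle amplification: record an object's
# observable state during test execution and diff it against the snapshot
# captured on the previous SUT version. All names here are assumptions.
import json

def snapshot(obj):
    """Capture the public, JSON-serialisable attributes of an object."""
    return {k: v for k, v in vars(obj).items()
            if not k.startswith("_") and isinstance(v, (int, float, str, bool, list))}

def diff_states(previous, current):
    """Return attributes whose values changed between SUT versions."""
    keys = set(previous) | set(current)
    return {k: (previous.get(k), current.get(k))
            for k in keys if previous.get(k) != current.get(k)}

class ShoppingCart:  # stand-in SUT class
    def __init__(self):
        self.items = []
        self.total = 0.0

    def add(self, price):
        self.items.append(price)
        self.total += price

if __name__ == "__main__":
    cart = ShoppingCart()
    cart.add(9.99)
    current = snapshot(cart)
    previous = json.loads('{"items": [9.99], "total": 9.99}')  # stored baseline
    print(diff_states(previous, current) or "no behaviour change detected")
```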

    How Do Deep-Learning Framework Versions Affect the Reproducibility of Neural Network Models?

    In the last decade, industry's demand for deep learning (DL) has increased due to its high performance in complex scenarios. Because of the complexity of DL methods, experts and non-experts alike rely on black-box software packages such as TensorFlow and PyTorch. These frameworks are constantly improving, and new versions are released frequently. As a natural part of software development, released versions contain improvements and changes to the methods and their implementation. Moreover, a version may be bug-polluted, leading to degraded model performance or preventing the model from working at all. The aforementioned implementation changes can lead to variance in the obtained results. This work investigates the effect of implementation changes across major releases of these frameworks on model performance. We perform our study using a variety of standard datasets. Our study shows that users should be aware that changing the framework version can affect model performance. Moreover, they should consider the possibility of a bug-polluted version before starting to debug source code that performed well before a version change. This also shows the importance of using isolated, containerized environments, such as Docker, when delivering a software product to clients.
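    A small, hedged sketch of the kind of safeguard this study motivates: record the framework versions that produced a result so that a later run on a different release can be flagged before debugging the model itself. The recording format and file name are assumptions; only the framework names come from the abstract.

```python
# Illustrative sketch: pin and verify the deep-learning framework versions that
# produced a result, so a silent version change is flagged early.
# The record format and file name are assumptions.
import importlib
import json

def installed_versions(packages=("tensorflow", "torch")):
    versions = {}
    for name in packages:
        try:
            versions[name] = importlib.import_module(name).__version__
        except ImportError:
            versions[name] = None  # framework not installed in this environment
    return versions

def check_against_record(record_path="training_environment.json"):
    """Compare current framework versions with those stored at training time."""
    try:
        with open(record_path) as f:
            recorded = json.load(f)
    except FileNotFoundError:
        with open(record_path, "w") as f:
            json.dump(installed_versions(), f)  # first run: create the record
        return []
    current = installed_versions()
    return [name for name, v in recorded.items() if current.get(name) != v]

if __name__ == "__main__":
    mismatched = check_against_record()
    if mismatched:
        print("framework version changed since training:", mismatched)
```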

    A Framework for Defect Prediction in Specific Software Project Contexts

    Software defect prediction has drawn the attention of many researchers in empirical software engineering and software maintenance due to its importance in providing quality estimates and identifying needs for improvement from a project-management perspective. However, most defect prediction studies appear valid primarily in a particular context, and little attention is given to how to determine which prediction model is well suited for a given project context. In this paper we present a framework for conducting software defect prediction as an aid for the project manager in the context of a particular project or organization. The framework has been aligned with practitioners' requirements and is supported by our findings from a systematic literature review on software defect prediction. We provide a guide to the body of existing studies on defect prediction by mapping the results of the systematic literature review to the framework.
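    A brief, hedged sketch of the basic activity the framework is meant to guide: fitting and evaluating a defect predictor on a single project's own historical data, so the estimate reflects that project context. The features, data, and model choice below are invented for illustration and are not part of the paper's framework.

```python
# Illustrative sketch: within-project defect prediction on made-up metrics.
# Feature set, data, and model choice are assumptions for demonstration only.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hypothetical per-file metrics from one project's history:
# [lines of code, cyclomatic complexity, recent changes]
X = [[120, 4, 1], [950, 22, 9], [300, 8, 2], [1500, 35, 14],
     [80, 2, 0], [640, 17, 6], [210, 5, 1], [1100, 28, 11]]
y = [0, 1, 0, 1, 0, 1, 0, 1]  # 1 = file contained a post-release defect

model = LogisticRegression(max_iter=1000)
# Within-project cross-validation estimates how well this model suits the context.
print(cross_val_score(model, X, y, cv=4).mean())
```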